<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.9.1"/>
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<title>Eigen: Writing efficient matrix product expressions</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/searchdata.js"></script>
<script type="text/javascript" src="search/search.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
  $(document).ready(function() { init_search(); });
/* @license-end */
</script>
<script type="text/x-mathjax-config">
  MathJax.Hub.Config({
    extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
    jax: ["input/TeX","output/HTML-CSS"],
});
</script>
<script type="text/javascript" async="async" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js"></script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="eigendoxy.css" rel="stylesheet" type="text/css">
<!--  -->
<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
 <tbody>
 <tr style="height: 56px;">
  <td id="projectlogo"><img alt="Logo" src="Eigen_Silly_Professor_64x64.png"/></td>
  <td id="projectalign" style="padding-left: 0.5em;">
   <div id="projectname"><a href="http://eigen.tuxfamily.org">Eigen</a>
   &#160;<span id="projectnumber">3.4.90 (git rev 67eeba6e720c5745abc77ae6c92ce0a44aa7b7ae)</span>
   </div>
  </td>
   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
        <span class="left">
          <img id="MSearchSelect" src="search/mag_sel.svg"
               onmouseover="return searchBox.OnSearchSelectShow()"
               onmouseout="return searchBox.OnSearchSelectHide()"
               alt=""/>
          <input type="text" id="MSearchField" value="Search" accesskey="S"
               onfocus="searchBox.OnSearchFieldFocus(true)" 
               onblur="searchBox.OnSearchFieldFocus(false)" 
               onkeyup="searchBox.OnSearchFieldChange(event)"/>
          </span><span class="right">
            <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.svg" alt=""/></a>
          </span>
        </div>
</td>
 </tr>
 </tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.9.1 -->
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
var searchBox = new SearchBox("searchBox", "search",false,'Search','.html');
/* @license-end */
</script>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
  <div id="nav-tree">
    <div id="nav-tree-contents">
      <div id="nav-sync" class="sync"></div>
    </div>
  </div>
  <div id="splitbar" style="-moz-user-select:none;" 
       class="ui-resizable-handle">
  </div>
</div>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(function(){initNavTree('TopicWritingEfficientProductExpression.html',''); initResizable(); });
/* @license-end */
</script>
<div id="doc-content">
<!-- window showing the filter options -->
<div id="MSearchSelectWindow"
     onmouseover="return searchBox.OnSearchSelectShow()"
     onmouseout="return searchBox.OnSearchSelectHide()"
     onkeydown="return searchBox.OnSearchSelectKey(event)">
</div>

<!-- iframe showing the search results (closed by default) -->
<div id="MSearchResultsWindow">
<iframe src="javascript:void(0)" frameborder="0" 
        name="MSearchResults" id="MSearchResults">
</iframe>
</div>

<div class="PageDoc"><div class="header">
  <div class="headertitle">
<div class="title">Writing efficient matrix product expressions </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>In general achieving good performance with <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a> does no require any special effort: simply write your expressions in the most high level way. This is especially true for small fixed size matrices. For large matrices, however, it might be useful to take some care when writing your expressions in order to minimize useless evaluations and optimize the performance. In this page we will give a brief overview of the <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a>'s internal mechanism to simplify and evaluate complex product expressions, and discuss the current limitations. In particular we will focus on expressions matching level 2 and 3 BLAS routines, i.e, all kind of matrix products and triangular solvers.</p>
<p>Indeed, in <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a> we have implemented a set of highly optimized routines which are very similar to BLAS's ones. Unlike BLAS, those routines are made available to user via a high level and natural API. Each of these routines can compute in a single evaluation a wide variety of expressions. Given an expression, the challenge is then to map it to a minimal set of routines. As explained latter, this mechanism has some limitations, and knowing them will allow you to write faster code by making your expressions more <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a> friendly.</p>
<h1><a class="anchor" id="GEMM"></a>
General Matrix-Matrix product (GEMM)</h1>
<p>Let's start with the most common primitive: the matrix product of general dense matrices. In the BLAS world this corresponds to the GEMM routine. Our equivalent primitive can perform the following operation: \( C.noalias() += \alpha op1(A) op2(B) \) where A, B, and C are column and/or row major matrices (or sub-matrices), alpha is a scalar value, and op1, op2 can be transpose, adjoint, conjugate, or the identity. When <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a> detects a matrix product, it analyzes both sides of the product to extract a unique scalar factor alpha, and for each side, its effective storage order, shape, and conjugation states. More precisely each side is simplified by iteratively removing trivial expressions such as scalar multiple, negation and conjugation. <a class="el" href="classEigen_1_1Transpose.html" title="Expression of the transpose of a matrix.">Transpose</a> and <a class="el" href="classEigen_1_1Block.html" title="Expression of a fixed-size or dynamic-size block.">Block</a> expressions are not evaluated and they only modify the storage order and shape. All other expressions are immediately evaluated. For instance, the following expression: </p><div class="fragment"><div class="line">m1.noalias() -= s4 * (s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2))  </div>
</div><!-- fragment --><p> is automatically simplified to: </p><div class="fragment"><div class="line">m1.noalias() += (s1*s2*<a class="code" href="namespaceEigen.html#ab84f39a06a18e1ebb23f8be80345b79d">conj</a>(s3)*s4) * m2.adjoint() * m3.conjugate() </div>
<div class="ttc" id="anamespaceEigen_html_ab84f39a06a18e1ebb23f8be80345b79d"><div class="ttname"><a href="namespaceEigen.html#ab84f39a06a18e1ebb23f8be80345b79d">Eigen::conj</a></div><div class="ttdeci">const Eigen::CwiseUnaryOp&lt; Eigen::internal::scalar_conjugate_op&lt; typename Derived::Scalar &gt;, const Derived &gt; conj(const Eigen::ArrayBase&lt; Derived &gt; &amp;x)</div></div>
</div><!-- fragment --><p> which exactly matches our GEMM routine.</p>
<h2><a class="anchor" id="GEMM_Limitations"></a>
Limitations</h2>
<p>Unfortunately, this simplification mechanism is not perfect yet and not all expressions which could be handled by a single GEMM-like call are correctly detected. </p><table class="manual" style="width:100%">
<tr>
<th>Not optimal expression </th><th>Evaluated as </th><th>Optimal version (single evaluation) </th><th>Comments  </th></tr>
<tr>
<td><div class="fragment"><div class="line">m1 += m2 * m3; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">temp = m2 * m3;</div>
<div class="line">m1 += temp; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">m1.noalias() += m2 * m3; </div>
</div><!-- fragment --> </td><td>Use .noalias() to tell <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a> the result and right-hand-sides do not alias. Otherwise the product m2 * m3 is evaluated into a temporary.  </td></tr>
<tr class="alt">
<td></td><td></td><td><div class="fragment"><div class="line">m1.noalias() += s1 * (m2 * m3); </div>
</div><!-- fragment --> </td><td>This is a special feature of <a class="el" href="namespaceEigen.html" title="Namespace containing all symbols from the Eigen library.">Eigen</a>. Here the product between a scalar and a matrix product does not evaluate the matrix product but instead it returns a matrix product expression tracking the scalar scaling factor. <br  />
 Without this optimization, the matrix product would be evaluated into a temporary as in the next example.  </td></tr>
<tr>
<td><div class="fragment"><div class="line">m1.noalias() += (m2 * m3).adjoint(); </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">temp = m2 * m3;</div>
<div class="line">m1 += temp.adjoint(); </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">m1.noalias() += m3.adjoint()</div>
<div class="line">               * m2.adjoint(); </div>
</div><!-- fragment --> </td><td>This is because the product expression has the EvalBeforeNesting bit which enforces the evaluation of the product by the Tranpose expression.  </td></tr>
<tr class="alt">
<td><div class="fragment"><div class="line">m1 = m1 + m2 * m3; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">temp = m2 * m3;</div>
<div class="line">m1 = m1 + temp; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">m1.noalias() += m2 * m3; </div>
</div><!-- fragment --> </td><td>Here there is no way to detect at compile time that the two m1 are the same, and so the matrix product will be immediately evaluated.  </td></tr>
<tr>
<td><div class="fragment"><div class="line">m1.noalias() = m4 + m2 * m3; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">temp = m2 * m3;</div>
<div class="line">m1 = m4 + temp; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">m1 = m4;</div>
<div class="line">m1.noalias() += m2 * m3; </div>
</div><!-- fragment --> </td><td>First of all, here the .noalias() in the first expression is useless because m2*m3 will be evaluated anyway. However, note how this expression can be rewritten so that no temporary is required. (tip: for very small fixed size matrix it is slightly better to rewrite it like this: m1.noalias() = m2 * m3; m1 += m4;  </td></tr>
<tr class="alt">
<td><div class="fragment"><div class="line">m1.noalias() += (s1*m2).block(..) * m3; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">temp = (s1*m2).block(..);</div>
<div class="line">m1 += temp * m3; </div>
</div><!-- fragment --> </td><td><div class="fragment"><div class="line">m1.noalias() += s1 * m2.block(..) * m3; </div>
</div><!-- fragment --> </td><td>This is because our expression analyzer is currently not able to extract trivial expressions nested in a <a class="el" href="classEigen_1_1Block.html" title="Expression of a fixed-size or dynamic-size block.">Block</a> expression. Therefore the nested scalar multiple cannot be properly extracted.  </td></tr>
</table>
<p>Of course all these remarks hold for all other kind of products involving triangular or selfadjoint matrices. </p>
</div></div><!-- contents -->
</div><!-- PageDoc -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
  <ul>
    <li class="footer">Generated on Thu Apr 21 2022 13:07:55 for Eigen by
    <a href="http://www.doxygen.org/index.html">
    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.9.1 </li>
  </ul>
</div>
</body>
</html>
